Supplementary Material for Labeling Gaps Between Words: Recognizing Overlapping Mentions with Mention Separators
Abstract
This is the supplementary material for “Labeling Gaps Between Words: Recognizing Overlapping Mentions with Mention Separators” (Muis and Lu, 2017). It explains in more depth the issue of spurious structures and the experimental settings.

1 Details on Spurious Structures

Regarding the mention hypergraph, we remarked in Section 3.1 that the normalization term calculated by the forward-backward algorithm includes spurious structures, which are structures that are not part of the true normalization term. This section shows in more detail why this is the case, using some examples.

Consider the simplified mention hypergraph shown in Figure 1 (top left), which consists of three words and whose possible edges have been restricted to those shown in the figure. Let A, B, C, D, E, F respectively denote the edges T1 → (I1), I0 → (I1), I1 → (I2), I1 → (X), I1 → (I2,X), and I2 → (X), as shown in Figure 1 (left). Further assume that features are defined only on these labeled edges.

Recall that in graphical models, any prediction by the model forms a (hyper-)path from the root node (here A0) to the leaf node (X), which means that each node other than the leaf node has exactly one outgoing (hyper-)edge. Notice that there are only three possible paths here, one for each of the three (hyper-)edges coming out of the node I1 associated with the word “Apache”. See the top right, bottom left, and bottom right of Figure 1 for the visualization.

Now recall that in the mention hypergraph, each node is assigned a certain set of mention combinations which the node represents, as defined ...
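The count of three paths can be checked with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation. It covers only the part of the simplified hypergraph below node I1 (edges C, D, E, F), because the edges between the root A0 and I1 are not fully listed in the text. The edge weights are hypothetical numbers chosen for the example. The sketch enumerates the structures obtained by picking one outgoing (hyper-)edge per node, and checks that a recursive inside-style sum equals the brute-force sum over those structures.

```python
from itertools import product
from math import isclose, prod

# Hyperedges below I1, as named in the text: name -> (parent, children).
HYPEREDGES = {
    "C": ("I1", ("I2",)),
    "D": ("I1", ("X",)),
    "E": ("I1", ("I2", "X")),
    "F": ("I2", ("X",)),
}
LEAF = "X"


def outgoing(node):
    """Names of the hyperedges whose parent is `node`."""
    return [name for name, (parent, _) in HYPEREDGES.items() if parent == node]


def structures(node):
    """Enumerate the structures below `node`, each as a tuple of edge names."""
    if node == LEAF:
        return [()]
    result = []
    for name in outgoing(node):
        children = HYPEREDGES[name][1]
        # Pick one sub-structure for every child of this hyperedge.
        for combo in product(*(structures(child) for child in children)):
            result.append((name,) + tuple(e for sub in combo for e in sub))
    return result


def inside(node, weight):
    """Sum over outgoing hyperedges of edge weight times the children's scores."""
    if node == LEAF:
        return 1.0
    return sum(
        weight[name] * prod(inside(child, weight) for child in HYPEREDGES[name][1])
        for name in outgoing(node)
    )


# Hypothetical edge weights, used only for this illustration.
weight = {"C": 2.0, "D": 3.0, "E": 5.0, "F": 7.0}

paths = structures("I1")
brute_force = sum(prod(weight[e] for e in path) for path in paths)
print(paths)
print(brute_force, inside("I1", weight))
```

In this tiny example, each choice of outgoing edge at I1 yields exactly one structure, so `paths` has three entries, and the recursive sum agrees with the explicit sum over them.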
Similar resources
Labeling Gaps Between Words: Recognizing Overlapping Mentions with Mention Separators
In this paper, we propose a new model that is capable of recognizing overlapping mentions. We introduce a novel notion of mention separators that can be effectively used to capture how mentions overlap with one another. On top of a novel multigraph representation that we introduce, we show that efficient and exact inference can still be performed. We present some theoretical analysis on the dif...
A Joint Framework for Coreference Resolution and Mention Head Detection
In coreference resolution, a fair amount of research treats mention detection as a preprocessed step and focuses on developing algorithms for clustering coreferred mentions. However, there are significant gaps between the performance on gold mentions and the performance on the real problem, when mentions are predicted from raw text via an imperfect Mention Detection (MD) module. Motivated by th...
Variability in word duration as a function of probability, speech style, and prosody.
This article examines how probability (lexical frequency and previous mention), speech style, and prosody affect word duration, and how these factors interact. Participants read controlled materials in clear and plain speech styles. As expected, more probable words (higher frequencies and clear speech second mentions) were significantly shorter than less probable words, and lexical frequency wo...
Joint Mention Extraction and Classification with Mention Hypergraphs
We present a novel model for the task of joint mention extraction and classification. Unlike existing approaches, our model is able to effectively capture overlapping mentions with unbounded lengths. The model is highly scalable, with a time complexity that is linear in the number of words in the input sentence and linear in the number of possible mention classes. Our model can be extended to a...
Recognizing Biomedical Named Entities Using Skip-Chain Conditional Random Fields
Linear-chain Conditional Random Fields (CRF) has been applied to perform the Named Entity Recognition (NER) task in many biomedical text mining and information extraction systems. However, the linear-chain CRF cannot capture long distance dependency, which is very common in the biomedical literature. In this paper, we propose a novel study of capturing such long distance dependency by defining ...